A number of authors have suggested that nonlinear interactions can enhance resolution of phase 
shifts beyond the usual Heisenberg scaling of 1/n, where n is a measure of resources such as the 
number of subsystems of the probe state or the mean photon number of the probe state. These 
suggestions are based on calculations of 'local precision' for particular nonlinear schemes. However, 
we show that there is no simple connection between the local precision and the average estimation 
error for these schemes, leading to a scaling puzzle. This is partially resolved by a careful analysis 
of iterative implementations of the suggested nonlinear schemes. However, it is shown that the 
suggested nonlinear schemes are still limited to an exponential scaling in $\sqrt{n}$. (This scaling may be 
compared to the exponential scaling in n which is achievable if multiple passes are allowed, even for 
linear schemes.) The question of whether nonlinear schemes may have a scaling advantage in the 
presence of loss is left open. Our results are based on a new bound for average estimation error that 
depends on (i) an entropic measure of the degree to which the probe state can encode a reference 
phase value, called the G-asymmetry, and (ii) any prior information about the phase shift. This 
bound is asymptotically stronger than bounds based on the variance of the phase shift generator. 
The G-asymmetry is also shown to directly bound the average information gained per estimate. Our 
results hold for any prior distribution of the shift parameter, and generalise to estimates of any shift 
generated by an operator with discrete eigenvalues. 
I. INTRODUCTION 

In many measurement scenarios, an environmental 
variable acts to translate or shift a property such as 
the optical phase or position of a probe state. Accu- 
rate estimation of the shifted parameter allows a cor- 
respondingly accurate measurement of the environmen- 
tal variable. For example, interferometric measurements 
of quantities ranging from temperature to gravitational 
wave amplitudes rely on the estimation of an optical 
phase shift. An important aim of quantum metrology 
is to determine the fundamental bounds on the resolu- 
tion of such estimates, and how these bounds scale with 
available resources such as energy (J-Q. 

Let us denote the initial probe state by the density operator $\rho_0$,
and the generator of shifts by some Hermitian operator $G$. Then if the
shift parameter $\Phi$ has the value $\phi$, the final probe state is
$\rho_\phi = \exp(-iG\phi)\,\rho_0\exp(iG\phi)$. In the following,
particular attention will be paid to the estimation of a phase shift
parameter, as this is sufficient for discussing various nonlinear
estimation schemes previously proposed in the literature [5]-[11]. The
generator $G$ in this case has integer eigenvalues, so that
$\rho_{\phi+2\pi} = \rho_\phi$. More generally, however, our results
apply to any shift generator $G$ having a discrete eigenvalue spectrum.
This includes the atomic scheme proposed in [12], which has recently led
to the first experimental demonstration of nonlinear quantum metrology
[13].

Returning to an optical example, a linear phase shift of a single-mode
optical probe state corresponds to $G = N$, where $N$ is the photon
number operator. Similarly, for a probe state comprising $m$ such modes,
each undergoing a nonlinear quadratic phase shift, one has
$G = (N_1)^2 + \cdots + (N_m)^2$ 0, i|. In cases like this, we quantify
the resources $n$ by the total mean photon number
$\sum_j \langle N_j\rangle$. Alternatively, for a probe comprising $n$
atomic qubits, each with a Pauli $Z$ operator $\sigma_z^{(j)}$, one may
consider the generator $G = \sigma_z^{(1)} + \cdots + \sigma_z^{(n)}$ and
powers thereof, corresponding to linear and nonlinear Ramsey
interferometers respectively @, H, OH [H|. Again, $n$ quantifies the
resources.

We note that this quantification of resources $n$ is different from the
$N$ (which we will denote $\mathcal{N}$) used in Refs. [14]-[17]. The $n$
used here typically corresponds to the conspicuous physical resources
required to generate the probe state, and is what has previously been
used to claim an advantage when using nonlinear interactions [Ml.

If $\hat\Phi$ denotes an estimate of a shift parameter $\Phi$ for some
measurement scheme, then a standard measure [20] of the performance of
the estimate is given by the average estimation error (called rms error
in Ref. [19]),

$$\epsilon(\hat\Phi) := \sqrt{E[(\hat\Phi - \Phi)^2]}, \tag{1}$$

where the expectation value here is defined as

$$E[(\hat\Phi - \Phi)^2] = \int d\phi\, d\hat\phi\, (\hat\phi - \phi)^2\, p(\hat\phi|\phi)\, p(\phi). \tag{2}$$

Here $p(\hat\phi|\phi)$ is the probability density of the estimate
conditioned on a fixed shift value $\Phi = \phi$, and $p(\phi)$ denotes
the prior probability density of the shift parameter. Measurement
schemes which minimise the average estimation error, for given resources
such as the average photon number or number of qubits available, are of
fundamental interest in quantum metrology.

However, attention has often focused instead on minimising a different
quantity, the 'local precision', defined for a fixed value of the shift
parameter, $\Phi = \phi$, by [21, 22]

$$P_\phi(\hat\Phi) := \left\langle \left( \frac{\hat\Phi}{|d\langle\hat\Phi\rangle_\phi/d\phi|} - \phi \right)^2 \right\rangle_\phi^{1/2}, \tag{3}$$

where $\langle\cdot\rangle_\phi$ denotes an average with respect to the
conditional probability density $p(\hat\phi|\phi)$. Some proposed
nonlinear measurement schemes can achieve local precisions which scale
in terms of the number of resources $n$ as, for example, $n^{-3/2}$
@, 0, S H P , $n^{-2}$ or $2^{-n}$ [9], for some value of $\phi$. Even so,
it will be shown below that the corresponding average estimation errors
can scale no better than the usual Heisenberg scaling, $n^{-1}$.

For estimates which are, approximately, locally unbiased for all values
of $\Phi$ over some interval [Hj], one has

$$\sqrt{\int d\phi\, p(\phi)\, P_\phi(\hat\Phi)^2} \approx \epsilon(\hat\Phi) \tag{4}$$

for shift parameters confined to this interval, providing a simple
connection to the average estimation error. However, many phase
estimates are unbiased only over very limited ranges [24], where these
ranges are of widths comparable to the local precision itself. Thus, for
example, while a high local precision of $2^{-K}$ in some region may
allow the $K$th binary digit of a phase shift to be estimated, it often
will not allow the preceding digits to be estimated with any accuracy.
These must either already be known (e.g., in phase tracking [25] or
phase sensing [26] applications), which requires the prior probability
distribution $p(\phi)$ to be almost as narrow as the posterior
distribution (after the measurement), or they must be determined using
further resources. Hence, unless the phase is already very well known,
the scaling of $P_\phi(\hat\Phi)$ may be a very poor guide to the scaling
of $\epsilon(\hat\Phi)$.

Indeed, whereas the local precision has a scaling lower bound set by the
root mean square deviation, $\Delta G$, of the generator for the probe
state [21, 22], the average estimation error has an asymptotically
stronger (i.e. higher) lower bound, set by the entropy, $H(G)$, of the
generator [H E|. Thus, maximising the variance, rather than the entropy,
of $G$ will not typically minimise $\epsilon(\hat\Phi)$. Here we further
generalise and strengthen this entropic bound, in Secs. II and III, to
replace $H(G)$ by the so-called $G$-asymmetry of the probe state [29].
The fundamental role of this quantity is emphasised by showing that it
also bounds the mutual information between the shift parameter $\Phi$
and any estimate $\hat\Phi$. An important consequence demonstrated in
Sec. IV is that, in a surprising contrast to the case of local
precision, simply replacing $G$ by some nonlinear function thereof, such
as $F = G^2$, cannot improve the average estimation error nor the
information gain.

A careful analysis in Sec. V shows that nonlinearity can improve the
scaling of $\epsilon(\hat\Phi)$ beyond $n^{-1}$ for iterative
implementations. These are implementations where the shift is applied on
a sequence of probes of different sizes, so that $G$ is replaced by a
suitable sum of nonlinear generators. However, for a probe state
comprising $n$ qubits, it is shown that even adaptive variable-pass
implementations of previously proposed nonlinear schemes can at best
achieve scalings exponential in $\sqrt{n}$ for the average estimation
error. In contrast, in Sec. VI, we show that the best possible scaling
for $\epsilon(\hat\Phi)$ is exponential in $n$, both for qubit and
optical probes, whether the generator is linear or nonlinear. Moreover,
an exponential scaling is in fact achievable via linear estimation
schemes, if multipass implementations are allowed. Whether nonlinear
schemes are more robust than linear schemes to the presence of loss is
left as a question for future investigation.

II. AN INFORMATION BOUND 

The mutual information between the shift parameter and its estimate,
$H(\Phi:\hat\Phi)$, is a measure of performance in its own right,
quantifying the average number of bits obtained per estimate. A general
upper bound for mutual information is obtained here, applicable to any
generator $G$ having a discrete spectrum, which will be used in Sec. III
below to obtain a lower bound for the average estimation error. Several
useful properties of this bound are also established.

Consider a parameter $\Phi$ with some prior distribution $p(\phi)$, and
define an average prior state
$\bar\rho = \int d\phi\, p(\phi)\, \rho_\phi$. Then, using the Holevo
bound [30], one immediately has
$H(\Phi:\hat\Phi) \le S(\bar\rho) - \int d\phi\, p(\phi)\, S(\rho_\phi) = S(\bar\rho) - S(\rho_0)$.
Here $S(\rho) = -\mathrm{tr}[\rho\ln\rho]$ denotes the von Neumann
entropy of the state $\rho$. Now define

$$\mathcal{U}_G(\rho) := \sum_g \Pi_g\, \rho\, \Pi_g = \lim_{w\to\infty} \frac{1}{w} \int_0^w d\phi\; e^{-iG\phi} \rho\, e^{iG\phi}, \tag{5}$$

where $\Pi_g$ is the projection onto the eigenspace corresponding to
eigenvalue $g$ of $G$, and the second equality may be checked by
considering a basis diagonal in $G$. This map is unital, i.e., it maps
the unit operator to itself. Using the nondecreasing property of von
Neumann entropy under unital maps [30], together with
$\mathcal{U}_G(\bar\rho) = \mathcal{U}_G(\rho_0)$, then yields the
desired upper bound,

$$H(\Phi:\hat\Phi) \le A_G(\rho_0) := S(\mathcal{U}_G(\rho_0)) - S(\rho_0), \tag{6}$$

for the mutual information.
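
As a concrete illustration of Eqs. (5) and (6), the following minimal sketch evaluates the $G$-asymmetry for a single qubit with $G = \sigma_z$, where the map in Eq. (5) simply deletes the off-diagonal elements of the density matrix. The function names, and the parametrisation of the state by a diagonal entry `a` and an off-diagonal entry `c`, are illustrative assumptions rather than notation from the text.

```python
import math

def entropy(probs):
    """Entropy in nats of a list of eigenvalues (zero entries are skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 1e-15)

def qubit_asymmetry(a, c):
    """S(U_G(rho)) - S(rho) for rho = [[a, c], [conj(c), 1 - a]] and G = sigma_z."""
    # Eigenvalues of rho follow from its trace (1) and determinant a(1 - a) - |c|^2.
    radius = math.sqrt((2 * a - 1) ** 2 + 4 * abs(c) ** 2)
    eig_rho = [(1 + radius) / 2, (1 - radius) / 2]
    # Dephasing in the sigma_z eigenbasis keeps only the diagonal (a, 1 - a).
    eig_dephased = [a, 1 - a]
    return entropy(eig_dephased) - entropy(eig_rho)

plus_state = qubit_asymmetry(0.5, 0.5)    # pure state (|z> + |-z>)/sqrt(2)
z_eigenstate = qubit_asymmetry(1.0, 0.0)  # eigenstate of G
mixed = qubit_asymmetry(0.5, 0.0)         # maximally mixed state
```

In this sketch the equal superposition attains the maximal qubit value $\ln 2$, whereas an eigenstate of $G$ and the maximally mixed state both have zero asymmetry, so neither can carry any information about the shift.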

The upper bound, $A_G(\rho_0)$ in Eq. (6), may be recognised as the
increase in quantum entropy due to an ideal measurement of $G$ on the
probe state, with post-measurement state $\mathcal{U}_G(\rho_0)$. This
entropy increase is relevant to bounding efficiencies in quantum
thermodynamics [31]. More generally, $A_G(\rho)$ represents the
asymmetry of the state $\rho$ with respect to a unitary group $G$ (in
this paper, the one-parameter Abelian group with Hermitian generator
$G$) [29]. The $G$-asymmetry quantifies the degree to which $\rho_0$ can
break the symmetry of $G$ (in this paper, the extent to which it carries
information about the variable $\Phi$ which is conjugate to $G$)
[29, 32-34]. For the case where $G$ has integer eigenvalues,
$A_G(\rho_0)$ quantifies to what extent $\rho_0$ can act as a phase
reference, an attribute clearly essential for detecting phase shifts.
Note that for $G$ with incommensurate eigenvalue gaps, the corresponding
group is non-compact, but the above expression (5) for $\mathcal{U}_G$
allows one to generalise the $G$-asymmetry (6) to this case also.

A form of Eq. (6) has been previously given for the special case of
compact groups where the average prior state $\bar\rho$ is symmetric
with respect to the group [32-34]. For the case of a phase shift (i.e. a
$G$ with integer eigenvalues) this means a prior distribution $p(\phi)$
which is uniform over the unit circle. Equation (6) represents a
generalisation, for the case of one-parameter groups, to an arbitrary
discrete generator $G$ and arbitrary prior distributions of the shift
parameter. Note that $\Phi$ ranges over $(-\infty, \infty)$ if
$e^{-iG\phi}$ is nonperiodic, corresponding to a noncompact group.

Several useful properties of $A_G(\rho)$ will be required further below.
First, if $\rho$ is pure and/or $G$ is nondegenerate, then the states
$\rho_g = \Pi_g \rho \Pi_g / p_g$, with $p_g = \mathrm{tr}[\rho\Pi_g]$,
are pure and mutually orthogonal, and Eqs. (5) and (6) yield

$$A_G(\rho) = H(G|\rho) - S(\rho), \tag{7}$$

where $H(G|\rho) = -\sum_g p_g \ln p_g$ is the entropy of the generator
for state $\rho$. Second, one has the general bounds

$$A_{f(G)}(\rho) \le A_G(\rho) \le H(G|\rho), \tag{8}$$

$$A_G(\lambda\rho + (1-\lambda)\rho') \le \lambda A_G(\rho) + (1-\lambda) A_G(\rho'), \tag{9}$$

where $f(G)$ is any function of $G$ and $0 \le \lambda \le 1$. The lower
bound in Eq. (8) is saturated when $f$ is 1:1, as may be seen by
replacing $G$ by $f(G)$ and $f$ by $f^{-1}$, while the upper bound is
saturated for any pure state from Eq. (7). The convexity of $A_G(\rho)$,
as per Eq. (9), implies that the $G$-asymmetry is maximised for pure
states.

The lower bound in (8) is obtained by noting that the eigenspaces of $G$
are subspaces of the eigenspaces of $f(G)$, so that
$\mathcal{U}_G \circ \mathcal{U}_{f(G)} = \mathcal{U}_G$, and using the
nondecreasing property of von Neumann entropy under unital maps [30] for
the particular case
$\mathcal{U}_{f(G)}(\rho) \to (\mathcal{U}_G \circ \mathcal{U}_{f(G)})(\rho)$.
To obtain the upper bound, let $|\psi\rangle$ be some purification of
$\rho$ on a tensor product of the probe Hilbert space with an ancilla
$a$, so that $\rho = \mathrm{tr}_a[|\psi\rangle\langle\psi|]$. Rewriting
$A_G(\rho)$ as $\sum_g p_g S(\rho\|\rho_g)$, where
$S(\rho\|\sigma) = \mathrm{tr}[\sigma(\ln\sigma - \ln\rho)]$ denotes the
relative entropy of $\rho$ and $\sigma$, one then has

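$$H(G|\rho) = H(G\otimes 1\,|\,|\psi\rangle\langle\psi|) = A_{G\otimes 1}(|\psi\rangle\langle\psi|)$$
$$= \sum_g p_g\, S\big(|\psi\rangle\langle\psi| \,\big\|\, (\Pi_g\otimes 1)|\psi\rangle\langle\psi|(\Pi_g\otimes 1)/p_g\big)$$
$$\ge \sum_g p_g\, S(\rho\|\rho_g) = A_G(\rho),$$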

as desired, where the second equality follows from Eq. (7), and the
inequality from the nonincreasing property of relative entropy under the
operation of tracing over the ancilla [30]. Finally, Eq. (9) may be
obtained via the representation
$A_G(\rho) = \lim_{w\to\infty} w^{-1} \int_0^w d\phi\, S(\rho\|\rho_\phi)$,
following from Eq. (5), and using the joint convexity property of the
relative entropy [30].

Equations (6) and (8) imply in particular that the mutual information is
bounded by the entropy of the generator for the probe state, i.e.,

$$H(\Phi:\hat\Phi) \le H(G|\rho_0). \tag{10}$$

Thus, for example, for a generator having $d$ distinct eigenvalues, no
more than $\ln d$ nats, i.e., $\log_2 d$ bits, of information can be
extracted per probe state about the value of the shift parameter.
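
The role of the generator entropy in Eqs. (7), (8) and (10) can be checked numerically for pure probe states, for which the $G$-asymmetry equals $H(G|\rho)$. The sketch below is a minimal illustration under that assumption; the helper name `generator_entropy` and the example spectrum are hypothetical choices, not taken from the text.

```python
import math
from collections import defaultdict

def generator_entropy(eigenvalues, probs):
    """H(G|rho) in nats; weights of degenerate eigenvalues are merged first."""
    merged = defaultdict(float)
    for g, p in zip(eigenvalues, probs):
        merged[g] += p
    return -sum(p * math.log(p) for p in merged.values() if p > 0)

# Pure probe with equal weight on five eigenvalues of G.
eigs = [-2, -1, 0, 1, 2]
probs = [0.2] * 5

h_linear = generator_entropy(eigs, probs)                   # equals ln 5
h_square = generator_entropy([g * g for g in eigs], probs)  # f(G) = G^2 merges +g and -g
h_cube = generator_entropy([g ** 3 for g in eigs], probs)   # f(G) = G^3 is one-to-one
```

Squaring the generator merges eigenvalues and lowers the entropy, while the one-to-one map leaves it unchanged, in line with the lower bound in Eq. (8) and its saturation condition.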



III. BOUNDS FOR RESOLUTION OF SHIFT 
PARAMETERS 

A. Average estimation error 

A strong bound for the average estimation error in Eq. (1) may be
derived analogously to weaker bounds obtained in [28], [HI, i.e., by
combining a quantum upper bound, such as Eq. (6), for the mutual
information with the classical lower bound [36, 37]

$$H(\Phi:\hat\Phi) \ge H(\Phi) - \tfrac{1}{2}\ln[2\pi e\, \epsilon(\hat\Phi)^2], \tag{11}$$

where $H(\Phi) = -\int p(\phi) \ln[p(\phi)]\, d\phi$ denotes the entropy
of the prior probability density $p(\phi)$ for $\Phi$. This lower bound
is well known from rate-distortion theory, and follows from the
inequality chain [37]
$H(\Phi:\hat\Phi) = H(\Phi) - H(\Phi|\hat\Phi) = H(\Phi) - H(\Phi - \hat\Phi|\hat\Phi) \ge H(\Phi) - H(\Phi - \hat\Phi) \ge H(\Phi) - \tfrac{1}{2}\ln[2\pi e\, \epsilon(\hat\Phi)^2]$,
where $H(A|B)$ denotes the conditional entropy $H(AB) - H(B)$.

In particular, the combination of Eqs. (6) and (11) immediately yields
the fundamental bound

$$\epsilon(\hat\Phi) \ge (2\pi e)^{-1/2}\, e^{H(\Phi)}\, e^{-A_G(\rho_0)} \tag{12}$$

for the average estimation error, for any discrete generator $G$. This
bound both strengthens and generalises previous entropic bounds in the
literature [17, 27, 28, 35]. For example, Nair [28] and Yuen [51] use
weaker upper bounds for the mutual information, corresponding to
replacing $A_G(\rho_0)$ in Eq. (12) by the quantum channel capacity
under a fixed photon number constraint. Hall and co-workers have
previously obtained bounds in a different manner, based on entropic
uncertainty relations, which correspond to replacing $A_G(\rho_0)$ in
Eq. (12) by the upper bound in Eq. (8) [13, [13] (and, alternatively, by
the upper bound in Eq. (7) for nondegenerate generators [27]), and
replacing $e^{H(\Phi)}$ by $1/\sigma_{\max}$, where $\sigma_{\max}$
denotes the maximum value of $p(\phi)$ [27].
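
To make the bound in Eq. (12) concrete, the following sketch evaluates its right-hand side for a completely unknown phase, assuming the uniform prior $p(\phi) = 1/(2\pi)$ and a probe whose generator has $d$ distinct eigenvalues, so that the $G$-asymmetry is at most $\ln d$. The function name and the chosen values of $d$ are illustrative assumptions.

```python
import math

def average_error_bound(prior_entropy, asymmetry):
    """Right-hand side of Eq. (12); both arguments are in nats."""
    return math.exp(prior_entropy - asymmetry) / math.sqrt(2 * math.pi * math.e)

# Completely unknown phase: p(phi) = 1/(2 pi) on [0, 2 pi), so H = ln(2 pi).
uniform_prior_entropy = math.log(2 * math.pi)

# A generator with d distinct eigenvalues has G-asymmetry at most ln d.
bounds = {d: average_error_bound(uniform_prior_entropy, math.log(d))
          for d in (2, 16, 1024)}
```

Under these assumptions the bound reduces to $(2\pi/e)^{1/2}/d$, so the smallest error permitted by Eq. (12) falls only as fast as the number of distinct eigenvalues that the probe state can populate.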

Note that our bound (12) is applicable to iterative schemes, including
adaptive ones, where the measurement performed on some probe state
components is dependent (in practice, through additional known phase
rotations) on the outcomes of earlier measurements on other components
[J]. This is because such a measurement scheme is formally equivalent to
first applying shift generators $G_1, G_2, \ldots$ to respective probe
components (e.g., qubits or optical modes), corresponding to applying
the total generator $G = G_1 + G_2 + \cdots$, and then performing the
measurements sequentially (and adaptively).

B. Local precision 

A bound for the local precision in Eq. (3) follows via the quantum
Cramér-Rao inequality 0, H| , and has the form [H HI

$$P_\phi(\hat\Phi) \ge (2\Delta G)^{-1}, \tag{13}$$

where $\Delta G$ denotes the root mean square deviation of the (total)
generator $G$ for the probe state. Note that, taking the averages of
$\nu$ independent estimates, one obtains the usual statistical
enhancement factor of $1/\sqrt{\nu}$ for both of the bounds in (12) and
(13). This will not be discussed further here, other than to remark that
although the latter bound for $P_\phi(\hat\Phi)$ may be asymptotically
achievable as $\nu \to \infty$ [21], [24], [23], this does not imply
anything about the achievability of the corresponding bound for
$\epsilon(\hat\Phi)$.

C. Comparisons 

For generators with integer eigenvalues, i.e., phase shift generators,
the scaling of $\epsilon(\hat\Phi)$ with the exponentiated
$G$-asymmetry $e^{-A_G(\rho_0)}$ in Eq. (12) implies a scaling with the
root mean square deviation $\Delta G$ which is at least as strong as
that for $P_\phi(\hat\Phi)$ in Eq. (13) (ignoring multiplicative
constants of order unity). This is a consequence of the inequality chain

$$e^{-A_G(\rho)} \ge e^{-H(G|\rho)} \ge (2\pi e)^{-1/2}\, [(\Delta G)^2 + 1/12]^{-1/2} \tag{14}$$

for such generators, where the first inequality follows from Eq. (8) and
the second is well known [§3, [HI. For the case of a completely unknown
phase shift, with $p(\phi) = 1/(2\pi)$, Eqs. (12) and (14) yield the
asymptotic lower bound $\epsilon(\hat\Phi) \ge (e\Delta G)^{-1}$ for the
average estimation error, which is comparable to the lower bound (13)
for local precision. However, importantly, the bound in Eq. (12) is
significantly more powerful, as we now show.
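
The second inequality in Eq. (14) can be checked numerically for any distribution over distinct integer eigenvalues. The sketch below does this for two hypothetical examples, a uniform distribution and a two-point distribution; the function names are illustrative, and the eigenvalues passed in are assumed to be distinct.

```python
import math

def entropy_and_deviation(eigenvalues, probs):
    """Return (H(G|rho) in nats, Delta G) for distinct eigenvalues with given weights."""
    mean = sum(p * g for g, p in zip(eigenvalues, probs))
    variance = sum(p * (g - mean) ** 2 for g, p in zip(eigenvalues, probs))
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy, math.sqrt(variance)

def eq14_sides(eigenvalues, probs):
    """Return (exp(-H), (2 pi e)^(-1/2) [(Delta G)^2 + 1/12]^(-1/2))."""
    entropy, delta_g = entropy_and_deviation(eigenvalues, probs)
    left = math.exp(-entropy)
    right = 1 / math.sqrt(2 * math.pi * math.e * (delta_g ** 2 + 1 / 12))
    return left, right

uniform = eq14_sides(list(range(8)), [1 / 8] * 8)   # entropy-maximising for its support
two_point = eq14_sides([-50, 50], [0.5, 0.5])       # variance-maximising for its range
```

For the uniform case the two sides are close, whereas for the two-point case the entropic quantity stays at 1/2 while the variance-based quantity becomes very small, which is the gap exploited in the comparison of the two bounds.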

Consider, for example, a probe comprising $n$ qubits, and generator
$G = \sigma_z^{(1)} + \cdots + \sigma_z^{(n)}$. The pure probe state
$(|z, z, \ldots, z\rangle + |-z, -z, \ldots, -z\rangle)/\sqrt{2}$, where
$|\pm z\rangle$ denotes the eigenstates of $\sigma_z$, then gives the
maximum possible value $\Delta G = n$ in Eqs. (13) and (14). That is why
this state, equivalent to a NOON state or GHZ state [3!|, is often
considered in quantum metrology. However, the corresponding
$G$-asymmetry follows via Eq. (7) as only
$A_G(\rho_0) = H(G|\rho_0) = \ln 2$, implying via Eq. (12) that the
average estimation error does not decrease at all as a function of $n$.
The only way an average estimation error scaling as $n^{-1}$ would be
possible from this state would be if there was sufficient prior
information. That is, from Eq. (12), if $-H(\Phi) + A_G(\rho_0)$ was of
order $\ln n$. But since $A_G(\rho_0) = \ln 2$, this means
$-H(\Phi) \sim \ln n$ itself. Hence the amount of prior information about
the parameter to be estimated would already be sufficient to locate it
with the precision achievable by the measurement.
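
The contrast between the two bounds for this state can be tabulated directly. The following sketch assumes a completely unknown phase, $p(\phi) = 1/(2\pi)$, and uses the values $\Delta G = n$ and $A_G(\rho_0) = \ln 2$ quoted above; the function name `ghz_bounds` is a hypothetical label.

```python
import math

def ghz_bounds(n):
    """Return (bound of Eq. (13), bound of Eq. (12)) for the n-qubit GHZ-type probe."""
    delta_g = float(n)        # eigenvalues +n and -n with equal weight
    asymmetry = math.log(2)   # two equally weighted eigenvalues, pure state
    local_bound = 1 / (2 * delta_g)
    prior_entropy = math.log(2 * math.pi)  # completely unknown phase
    average_bound = (math.exp(prior_entropy - asymmetry)
                     / math.sqrt(2 * math.pi * math.e))
    return local_bound, average_bound

table = {n: ghz_bounds(n) for n in (1, 10, 100, 1000)}
```

The first entry of each pair falls as $1/(2n)$, while the second stays fixed at $(2\pi/e)^{1/2}/2 \approx 0.76$ for every $n$ under the stated assumptions.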

It is thus apparent that the lower bounds (12) and (13) can exhibit
markedly different behaviour, with the former bound having an
asymptotically stronger scaling in general. It follows that probe states
generating optimal scaling for $P_\phi(\hat\Phi)$, obtained by
maximising $\Delta G$ under various constraints for some value of
$\phi$, do not necessarily correspond to optimal bounds for
$\epsilon(\hat\Phi)$. Since it is the latter quantity which has direct
operational significance for the performance of the estimate, this has
crucial implications for some nonlinear estimation schemes proposed in
the literature, as will be seen below.

IV. RESOLUTION PUZZLE: 
NONLINEARITY VS G-ASYMMETRY 

A. Probes comprising n qubits 

Several nonlinear phase estimation schemes have been proposed for Ramsey
interferometry, based on a probe state comprising $n$ qubits [5]-[11].
For example, defining $J_z := \sigma_z^{(1)} + \cdots + \sigma_z^{(n)}$,
the generator $(J_z)^q$ has been considered in @,H,[i3 for
$q = 2, 3, \ldots$, and the generator $nJ_z$ in [10]. A nonlinearity
equivalent to $J_z^2$, although defined in terms of a fixed number of
bosons shared between two modes, was considered in [5]. Further, the
generators $H$ and $A$ defined via

$$H + iA = (\sigma_x^{(1)} + i\sigma_y^{(1)}) \otimes \cdots \otimes (\sigma_x^{(n)} + i\sigma_y^{(n)}) \tag{15}$$

were considered in [9].

Now, the linear generator $G = J_z$ has $n + 1$ distinct eigenvalues,
$-n, -n+2, \ldots, n$, and hence
$A_G(\rho) \le H(G|\rho) \le \ln(n+1)$. It follows immediately from
Eq. (12) (see also Eq. (22) of [13]), that the corresponding average
estimation error can scale no better than $(n+1)^{-1}$ with qubit
number, corresponding asymptotically to the Heisenberg scaling limit of
$n^{-1}$.

However, noting from Eq. (8) that $A_{f(G)}(\rho) \le A_G(\rho)$ for any
function $f$, precisely the same conclusion follows for the nonlinear
generators $G = (J_z)^q$ and $G = nJ_z$. That is, the average estimation
error cannot achieve better than Heisenberg scaling for these
generators. Moreover, the nonlinear generators $G = H$ and $G = A$ of
Eq. (15) do not even allow the possibility of Heisenberg scaling, as
they each only have 3 distinct eigenvalues, $0$ and $\pm 2^{n-1}$.
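
The eigenvalue counting behind this argument is easy to reproduce. The sketch below lists the spectrum of $J_z$ for $n$ qubits and counts the distinct eigenvalues of $(J_z)^q$; the function names are hypothetical, and the logarithm of each count is the largest generator entropy available to the corresponding scheme.

```python
def jz_eigenvalues(n):
    """Spectrum of J_z for n qubits: -n, -n + 2, ..., n."""
    return [n - 2 * k for k in range(n + 1)]

def distinct_eigenvalue_count(n, q=1):
    """Number of distinct eigenvalues of (J_z)^q for n qubits."""
    return len({g ** q for g in jz_eigenvalues(n)})

counts = {q: distinct_eigenvalue_count(6, q) for q in (1, 2, 3)}
```

For six qubits the linear generator has 7 distinct eigenvalues, the quadratic one only 4, and the cubic one again 7, so raising $J_z$ to a power never increases the count beyond $n + 1$.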

The above results are in stark contrast to the best possible scalings of
local precision for these schemes, which improve on Heisenberg scaling,
with $n^{-q}$ for $G = (J_z)^q$ MM, S El; $n^{-2}$ for $G = nJ_z$ [10];
and $2^{-n}$ for $G = H$ or $A$ [9]. This difference in scalings
immediately raises a conundrum: how can nonlinearity improve the local
precision, yet not the average estimation error?

This puzzle may be further deepened by noting that the probe states
yielding optimal local precisions are generally an equally-weighted
superposition of two orthogonal eigenstates of $G$, corresponding to the
maximum and minimum eigenvalues of the generator @, H, [HI E3. Thus,
$A_G(\rho_0) = \ln 2$ for such a probe state, implying that the average
estimation error cannot decrease with $n$ at all, as discussed in
Sec. III C above. Indeed, Eq. (10) implies that no more than 1 bit of
information about the phase shift can be gained via such an 'optimal'
probe state.



B. Probes comprising optical modes

An analogous puzzle holds for optical probes. As a simple example, if
$G = N$ is the number operator for a single mode field, then
$H(G|\rho) \le \ln(e(\langle N\rangle + 1))$, implying from Eqs. (8) and
(12) that the average estimation error can scale no better than
$(\langle N\rangle + 1)^{-1}$ (see also (T3, H3, HI, HH). But for any
nonlinear generator $G = f(N)$,
$A_G(\rho) \le A_N(\rho) \le H(N|\rho)$ from Eq. (8). Hence, the same
scaling bound also applies to nonlinear generators for single mode
fields.

In contrast, nonlinearity can significantly enhance the local precision.
For example, choosing the coherent probe state
$\rho_0 = |\alpha\rangle\langle\alpha|$ and nonlinear generator
$G = N^2$, $P_\phi(\hat\Phi)$ can scale as $\langle N\rangle^{-3/2}$ for
large $\langle N\rangle$ [5]. For this case the photon number
distribution is Poissonian, which is well approximated by a Gaussian
distribution for large $\langle N\rangle$. Thus, using Eq. (7),
$A_G(\rho_0) = H(N^2|\rho_0) = H(N|\rho_0) \approx (1/2)\ln(2\pi e\langle N\rangle)$,
implying via Eq. (12) that the corresponding average estimation error
can decrease with $\langle N\rangle$. However, it cannot scale even as
well as the Heisenberg limit, $\langle N\rangle^{-1}$, but rather is
lower bounded by the standard quantum limit scaling,
$\langle N\rangle^{-1/2}$.

These examples again lead to the puzzle that while nonlinearity can
improve the scaling of the local precision, it cannot, by itself,
influence the scaling of the average estimation error. This raises the
question: can nonlinear schemes offer any advantage over linear schemes?



V. PUZZLE RESOLUTION: ITERATIVE
SCHEMES

It has been seen that a simple replacement of a generator by a nonlinear
function thereof cannot lead to an improved scaling of the average
estimation error, in marked contrast to the situation regarding the
local precision. However, a careful analysis shows that with a suitable
sum of nonlinear generators, the bounds in Eqs. (12) and © allow for an
enhanced scaling of $\epsilon(\hat\Phi)$, and that this enhanced scaling
could, plausibly, be achievable by adaptive measurements. This resolves
the above puzzle to some degree. Significantly, however, the scaling of
the average estimation error does not necessarily achieve the same
scaling as the local precision.

As a first example, let $G(l)$ denote the nonlinear generator $(J_z)^2$
for $l$ qubits, and let $\rho(l)$ denote an equally weighted
superposition of two eigenstates of $G(l)$, corresponding to its minimum
and maximum eigenvalues (i.e., to 0 and $l^2$ if $l$ is even, and 1 and
$l^2$ if $l$ is odd). Now consider the total generator and corresponding
composite probe state defined by

$$G_{\rm tot} := \sum_{k=1}^{K} G(n_k), \qquad \rho_0 := \bigotimes_{k=1}^{K} \rho(n_k), \tag{16}$$

with $n_k := \lceil 2^{(k-1)/2} \rceil$ (where $\lceil x \rceil$ denotes
the smallest integer not less than $x$). Since the phase shift generated
by $G(l)$ has period $\approx 2\pi/l^2$, this ensures that the phase
shift generated by $G(n_k)$, on the probe state component $\rho(n_k)$,
has period $\approx 2\pi/2^{k-1}$.
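
The component sizes $n_k$ entering Eq. (16) can be generated directly from their definition. The sketch below is a minimal illustration with hypothetical function and variable names; it also compares the total number of qubits for one copy of each component with the closed-form estimate $(2^{K/2} - 1)/(\sqrt{2} - 1)$ obtained by dropping the ceiling function.

```python
import math

def component_sizes(K):
    """n_k = ceil(2^((k-1)/2)) for k = 1, ..., K."""
    return [math.ceil(2 ** ((k - 1) / 2)) for k in range(1, K + 1)]

K = 8
sizes = component_sizes(K)
total = sum(sizes)                                        # qubits for one copy of each component
estimate = (2 ** (K / 2) - 1) / (math.sqrt(2) - 1)        # geometric sum without the ceiling

# Each component must resolve at least a fraction 1/2^(k-1) of a full period.
periods_ok = all(n * n >= 2 ** (k - 1) for k, n in enumerate(sizes, start=1))
```

For $K = 8$ this gives component sizes 1, 2, 2, 3, 4, 6, 8, 12, a total of 38 qubits, compared with an estimate of about 36 from the geometric sum.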

The basic idea is that the $k$th bit in a binary expansion of
$\Phi/(2\pi)$ is estimated from the $k$th component of the probe state.
Note that it is impossible to obtain more than 1 bit from each component
of the probe state, i.e., more than $\ln 2$ nats, as a consequence of
Eq. (6) and the property
$A_{G(l)}(\rho(l)) = H(G(l)|\rho(l)) = \ln 2$. This is the idea behind
the famous quantum phase estimation algorithm [30] for linear phase
shifts, subsequently generalized in (bJQjJ. In practice, to achieve the
best scaling for the average estimation error it may be necessary to use
$M > 1$ copies of each component of the probe state to estimate each bit
accurately, when combined with an adaptive measurement sequence [14],
[H|. Counter-intuitively, it is the least significant ($K$th) bit of
$\Phi/(2\pi)$ that should be determined first, to allow the optimal
measurement of the $(K-1)$th bit, and so on up to the most significant
bit.

The total number of qubits required in the above setup is
$n = M\sum_k n_k \approx M(2^{K/2} - 1)/(\sqrt{2} - 1)$. Further, there
are $2^K$ distinct eigenvalues of $G_{\rm tot}$, of the form
$\sum_k b_k 2^{k-1}$ for $b_k = 0$ or 1, where these have a uniform
distribution for $\rho_0$. Taking into account the $M$ copies of
$G_{\rm tot}$ and $\rho_0$, the corresponding generator
$G = G_{\rm tot}^{(1)} + \cdots + G_{\rm tot}^{(M)}$ has eigenvalues
ranging from 0 to $M(2^K - 1)$, implying a $G$-asymmetry
$A_G(\otimes^M \rho_0) \le H(G|\otimes^M \rho_0) \le \ln[M 2^K]$ (note
that for $M > 1$ the distribution of $G$ over $\otimes^M \rho_0$ is not
uniform, and this upper bound is not tight). Thus Eq. (12) yields the
following lower bound for the average estimation error,

$$\epsilon(\hat\Phi) \ge \frac{(2\pi/e)^{1/2}}{M[1 + (\sqrt{2} - 1)n/M]^2} \sim \frac{(2\pi/e)^{1/2}\, M}{(3 - 2\sqrt{2})\, n^2}, \tag{17}$$

in the worst-case scenario of a completely random phase shift. Note that
this bound is compatible with the scaling expected for a scheme that
determines the first $K$ bits of $\Phi$, giving
$\epsilon(\hat\Phi) \lesssim (2\pi)/2^{K+1} \sim (M/n)^2$. In other
words, for $M$ large enough for this bitwise estimation scheme to work,
we would expect the scaling with $n$ in Eq. (17) to be attainable.
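
The lower bound for this scheme can be evaluated numerically by inserting $A_G \le \ln[M 2^K]$ and $2^{K/2} \approx 1 + (\sqrt{2} - 1)n/M$ into Eq. (12) with a uniform prior. The sketch below does so under exactly those assumptions and compares the result with its large-$n$ form; the function names and sample values of $n$ are illustrative.

```python
import math

# (2 pi e)^(-1/2) * exp(H) for the uniform prior p(phi) = 1/(2 pi).
PREFACTOR = math.sqrt(2 * math.pi / math.e)

def quadratic_scheme_bound(n, M=1):
    """Eq. (12) with asymmetry ln(M 2^K) and 2^(K/2) = 1 + (sqrt(2) - 1) n / M."""
    return PREFACTOR / (M * (1 + (math.sqrt(2) - 1) * n / M) ** 2)

def quadratic_scheme_asymptote(n, M=1):
    """Large-n form, using (sqrt(2) - 1)^2 = 3 - 2 sqrt(2)."""
    return PREFACTOR * M / ((3 - 2 * math.sqrt(2)) * n ** 2)

ratios = [quadratic_scheme_bound(n) / quadratic_scheme_asymptote(n)
          for n in (10, 1000, 100000)]
```

The ratio of the two expressions approaches 1 from below as $n$ grows, confirming the $n^{-2}$ scaling of the bound for fixed $M$ under these assumptions.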

Thus, this adaptive scheme demonstrates the possibility of an asymptotic
$n^{-2}$ scaling for the average estimation error. This is the same
scaling (up to a constant factor) as for the optimal local precision for
the generator $G(n)$ @, . Furthermore, an $n^{-q}$ asymptotic scaling
can be obtained for an analogous adaptive scheme based on the nonlinear
generator $(J_z)^q$, with



$$\epsilon(\hat\Phi) \gtrsim c_q\, \frac{M^{q-1}}{n^q} \tag{18}$$



for a phase shift random over $[0, 2\pi)$, where $c_q$ increases
exponentially with $q$. Analogous results may be obtained for iterative
implementations of the schemes in 0, @] .

However, a correspondence of scalings between $P_\phi(\hat\Phi)$ and
$\epsilon(\hat\Phi)$ does not hold more generally. For example, let
$G(l)$ instead denote the nonlinear generator $H + 2^{l-1}$ for $l$
qubits, with $H$ as in Eq. (15) but with $l$ in place of $n$, and with
the additive constant being chosen to simplify eigenvalue counting.
Further, let $\rho(l)$ denote an equally weighted superposition of the
two eigenstates corresponding to the minimum and maximum eigenvalues 0
and $2^l$ of $G(l)$ (note such superpositions include the separable
states $|z, z, \ldots, z\rangle$ and $|-z, -z, \ldots, -z\rangle$
S S3]). Successive estimation of $K$ binary digits then corresponds to
$n_k = k - 1$ in Eq. (16), requiring a total qubit number
$n = MK(K-1)/2 \approx MK^2/2$. One has
$A_G(\otimes^M \rho_0) \le \ln(M 2^K)$ as before, yielding the lower
bound

$$\epsilon(\hat\Phi) \ge (2\pi/e)^{1/2}\, M^{-1}\, 2^{-\sqrt{2n/M}} \tag{19}$$

for the case of a completely random phase shift. Thus the scaling with
$n$ is considerably worse than the $2^{-n}$ scaling of the local
precision for the corresponding single generator scheme [9].

This last result demonstrates that the local precision does not
necessarily characterise the performance of the average estimation error
even for adaptive implementations. It follows that comparisons between
various schemes, whether linear or nonlinear, should be made on the
basis of the operationally significant quantity, $\epsilon(\hat\Phi)$ in
Eq. (1), rather than $P_\phi(\hat\Phi)$ in Eq. (3).



VI. LINEAR SCHEMES WITH OPTIMAL 
SCALING 

A. Probes comprising n qubits 

Since any generator $G$ for $n$ qubits has at most $2^n$ distinct
eigenvalues, it follows from Eqs. (7) and (8) that
$A_G(\rho_0) \le H(G|\rho_0) \le n\ln 2$, with equality for a probe
state that is an equally weighted superposition of the corresponding
eigenstates. Hence, from Eq. (12), the best possible scaling for the
average estimation error satisfies

$$\epsilon(\hat\Phi) \ge (2\pi e)^{-1/2}\, e^{H(\Phi)}\, 2^{-n}. \tag{20}$$

Estimation schemes having an exponential scaling linear in $n$ are
therefore of fundamental interest. Note, as per Eq. (19), that such a
scaling is not attained via an adaptive implementation of the nonlinear
scheme in [9], despite the local precision scaling as $2^{-n}$ for this
scheme.

It is possible that nonlinear schemes exist with an ex- 
ponential scaling in n for e($). Here we show that, sur- 
prisingly, such a scaling can be achieved with linear gen- 
erators. The extra ingredient that makes this possible 
is to allow for multiple (and varied) applications of the 
phase shift prior to measurement. 

In particular, following the ideas in Higgins et al. [14], consider a
probe system comprising $K$ unentangled qubits, each in the state
$|+\rangle = (|-z\rangle + |z\rangle)/\sqrt{2}$, where the $k$th qubit is
subjected to $2^{k-1}$ applications of a linear phase shift generated by
$(1 + \sigma_z^{(k)})/2$. In Ref. [H| this was achieved experimentally
via multiple passes through a medium. Another possibility would be to
suitably increase interaction times of the qubits with the phase shift
medium. The total generator and the probe state therefore have the forms

$$G_{\rm tot} = \sum_{k=1}^{K} 2^{k-1} (1 + \sigma_z^{(k)})/2, \qquad \rho_0 = \bigotimes_{k=1}^{K} |+\rangle\langle +|. \tag{21}$$

The total phase shift of the $k$th qubit thus has period
$2\pi/2^{k-1}$, and so can be used to estimate the $k$th bit of
$\Phi/(2\pi)$, in the adaptive manner explained above.
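
The spectrum of the multipass generator can be enumerated explicitly, since each qubit contributes either 0 or $2^{k-1}$. The following sketch, with hypothetical function and variable names, lists all eigenvalues for a small number of qubits and evaluates the resulting generator entropy, assuming the uniform weights that follow from each qubit being in the state $|+\rangle$.

```python
import math
from itertools import product

def multipass_spectrum(K):
    """Eigenvalues sum_k 2^(k-1) b_k of the total generator, over all b_k in {0, 1}."""
    return sorted(sum(b << k for k, b in enumerate(bits))
                  for bits in product((0, 1), repeat=K))

K = 5
spectrum = multipass_spectrum(K)

# Every basis state has weight 2^(-K), so the entropy is the log of the number of
# distinct eigenvalues.
entropy = math.log(len(set(spectrum)))
```

Each integer from 0 to $2^K - 1$ appears exactly once, so the generator entropy is $K\ln 2$, the largest value allowed for $K$ qubits.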

If, as for the nonlinear schemes above, $M$ copies are used to estimate
each bit accurately, then the total number of qubits required is
$n = MK$. Following the style of arguments used in the preceding section
gives a lower bound on the average estimation error of

$$\epsilon(\hat\Phi) \ge (2\pi e)^{-1/2}\, e^{H(\Phi)}\, M^{-1}\, 2^{-n/M}. \tag{22}$$

In this case it would appear that minimizing $M$ could make a big
improvement to the precision, and for $M = 1$ the ultimate scaling (20)
might be achievable. However it must be remembered that Eq. (22) is
merely a lower bound. Moreover, from the arguments in the preceding
section, it is only for $M$ sufficiently large that we expect these
bitwise estimation schemes to attain the scaling with $n$ of the lower
bounds.

Luckily, in this case, we can compare this bound to the
actual performance of the best known adaptive schemes,
for an initially completely random phase, as this has been
studied extensively. These studies were done using the
Holevo variance $V_H(\hat\Phi)$ [H| rather than the average
estimation error, but when these are small (as here, for $n$
large), $V_H(\hat\Phi) \leq \epsilon(\hat\Phi)^2 \leq (\pi/2)^2 V_H(\hat\Phi)$ [42]. In terms
of scaling with $n$, the best performance is indeed for
$M = 1$, which corresponds to the quantum phase estimation
algorithm [30] and yields [15]

$$\epsilon(\hat\Phi) \sim c \times 2^{-n/2} . \qquad (23)$$



(The constant $c \approx 1.18$ can be evaluated by performing
the integral of the distribution of phase estimates,
Eq. (4.5) of [15].) Although this is not identical scaling
to (20), it is still exponential in $n$, unlike (19). For
$M = 2$, 3, and 4, the performance scales as $2^{-n/4}$, while
for $M > 4$ it scales as $2^{-n/M}$, achieving the lower bound
scaling in Eq. (22) as expected for $M$ sufficiently large.

Note that in terms of $\mathcal{N} = M(2^{K+1} - 1)$, the number
of qubit-passes through the phase shift (which is the
resource considered in [14-17]), the change in scaling as $M$
increases appears quite different. For $M = 1$ and $M = 2$,
the scaling is $\mathcal{N}^{-1/2}$; for $M = 3$ it is $\mathcal{N}^{-3/4}$; and for
$M \geq 4$ it is $\mathcal{N}^{-1}$. We emphasize that counting resources
as above, in terms of the number of qubits $n$, is necessary
to enable consistent comparison with the nonlinear
schemes considered in @-[llj]. Recently, some other papers
have also considered the number of qubits (or, more
strictly, the number of qubit measurements) as a resource
[43,44]. However, as these were motivated by qubit gate
characterization in solid-state quantum computing, they
imposed the constraint that the qubit measurement basis
is fixed. In this case the only thing that can be chosen
adaptively is the number of times the phase shift is
applied to a given qubit prior to measurement. In [43]
numerical evidence was presented that, using a locally
optimal ("greedy") adaptive algorithm, a scaling of
approximately $2^{-0.1n}$ is achievable. In [44] an analytical
argument was given suggesting that a scaling of
approximately $2^{-0.16n}$ should be achievable. Neither achieves the
scaling (23) of schemes that allow adaptive controlled
qubit rotations prior to measurement.
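
The two ways of counting resources used in this subsection can be related by a short calculation. For fixed $M$ the number of qubit-passes grows as $2^K$ up to a constant factor, and $n = MK$, so a scaling of $\mathcal{N}^{-a}$ corresponds to $2^{-an/M}$. The following sketch, with illustrative function names, applies this conversion to the pass-count exponents quoted above and compares the result with the exponent $1/M$ of the lower bound in Eq. (22). It takes the $\mathcal{N}^{-1}$ case to include $M = 4$, which is an assumption made here because it reproduces the $2^{-n/4}$ scaling quoted for $M = 4$.

```python
from fractions import Fraction


def pass_exponent(M):
    """Exponent a in the quoted scaling N_passes^(-a) for M copies per bit."""
    if M in (1, 2):
        return Fraction(1, 2)
    if M == 3:
        return Fraction(3, 4)
    return Fraction(1)  # assumed to hold for all M >= 4


def qubit_exponent(M):
    """Exponent b in 2^(-b*n), using N_passes ~ 2^K and n = M*K, so b = a/M."""
    return pass_exponent(M) / M


def bound_exponent(M):
    """Exponent of the factor 2^(-n/M) in the lower bound of Eq. (22)."""
    return Fraction(1, M)


if __name__ == "__main__":
    for M in range(1, 8):
        print(M, pass_exponent(M), qubit_exponent(M), bound_exponent(M))
```

The converted exponents are 1/2 for $M = 1$, 1/4 for $M = 2$, 3, and 4, and $1/M$ beyond that, in agreement with the qubit-number scalings stated earlier; they never exceed the exponent of the lower bound and coincide with it from $M = 4$ onwards.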



B. Probes comprising optical modes 

For an optical probe containing $m$ orthogonal modes,
let $N_j$ denote the photon number of the $j$th mode, and
$N$ denote the total photon number $N_1 + \cdots + N_m$. The
entropy of any generator $G = f(N_1, \dots, N_m)$ is bounded
above by the joint entropy of $N_1, \dots, N_m$ (since the
distribution of $G$ is a coarse graining of the joint
distribution), yielding via Eq. (15)



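$$A_G(\rho) \leq H(G|\rho) \leq H(N_1, \dots, N_m|\rho)$$
$$\leq m \ln\left(1 + \frac{\langle N\rangle}{m}\right) + \langle N\rangle \ln\left(1 + \frac{m}{\langle N\rangle}\right) \qquad (24)$$
$$\leq m + \langle N\rangle . \qquad (25)$$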



The second line follows from standard statistical mechanics
techniques, and the third line from the monotonic
convergence of $(1 + x/y)^y$ to $e^x$ as $y$ increases.
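
The chain of inequalities above can be checked numerically. The sketch below assumes that the maximum joint entropy at fixed mean total photon number is attained by $m$ independent modes with identical geometric (thermal) photon-number distributions, which is the standard statistical-mechanics result alluded to. The function names and the truncation cutoff are illustrative choices.

```python
import math


def joint_entropy_bound(m, mean_n):
    """Second line of the chain: m ln(1 + <N>/m) + <N> ln(1 + m/<N>), in nats."""
    return m * math.log1p(mean_n / m) + mean_n * math.log1p(m / mean_n)


def geometric_entropy(nbar, cutoff=5000):
    """Entropy (nats) of p_j = (1 - q) q^j with q = nbar/(1 + nbar).

    Summed directly over j < cutoff; the neglected tail is negligible for
    the moderate values of nbar used here.
    """
    q = nbar / (1.0 + nbar)
    total = 0.0
    for j in range(cutoff):
        p = (1.0 - q) * q ** j
        if p > 0.0:
            total -= p * math.log(p)
    return total


if __name__ == "__main__":
    m, mean_n = 4, 8.0
    direct = m * geometric_entropy(mean_n / m)  # m identical thermal modes
    print(direct, joint_entropy_bound(m, mean_n), m + mean_n)
```

The directly summed entropy of $m$ thermal modes agrees with the closed form in the second line, and both stay below $m + \langle N\rangle$, as the third line requires.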

Using Eq. (12), the average estimation error therefore
has the fundamental lower bound

$$\epsilon(\hat\Phi) \geq (2\pi e)^{-1/2}\, e^{H(\Phi)}\, e^{-m - \langle N\rangle} \qquad (26)$$

for any generator which is a function of $N_1, \dots, N_m$. This
includes the total photon number, $N = N_1 + \cdots + N_m$, in
particular [35], but also includes, for example, the nonlinear
generators $N^2$ and $N_1 N_2 \cdots N_m$. Similarly, using
Eq. (6), one has the fundamental upper bound $m + \langle N\rangle$
for the mutual information $H(\hat\Phi : \Phi)$.

It follows that estimation schemes whose error decreases
exponentially in the average photon number are of
fundamental interest. Further, a linear scheme, analogous
to the one above for $n$ qubits, is sufficient to obtain such
a scaling (though with a different coefficient). In particular,
the $M = 1$ linear multipass scheme of Higgins et al.
[14], equivalent to the quantum phase estimation algorithm
[30], is precisely such a scheme, involving $m = 2K$
modes, with each pair of modes in the superposition state
$(|0\rangle|1\rangle - |1\rangle|0\rangle)/\sqrt{2}$. Hence, $K = \langle N\rangle = m/2$, and from
Eq. (23), $\epsilon(\hat\Phi) \sim 1.18 \times e^{-K\ln(2)/2}$ asymptotically, which
is consistent with Eq. (26) as here $m + \langle N\rangle = 3K$. Again,
it should be noted, as for the qubit case above, that the
measure of resources considered here is the total mean photon
number $\langle N\rangle$ required for the scheme, rather than
the number of photon-passes $\mathcal{N}$ through the phase shift
medium as in Ref. [14]. Again, we use $\langle N\rangle$ to enable
comparison with various linear and nonlinear estimation
schemes Q. Moreover, photon number is a natural
measure of interest to consider, as it characterises the
energy resources required for a given optical scheme.
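
The consistency claim for the $M = 1$ optical scheme can be made concrete with a few lines of Python. The sketch below compares the asymptotic error $1.18 \times e^{-K\ln(2)/2}$ with the lower bound of Eq. (26), read as $(2\pi e)^{-1/2} e^{H(\Phi)} e^{-m - \langle N\rangle}$, for $m = 2K$ and $\langle N\rangle = K$. It assumes a uniform prior on $[0, 2\pi)$, so that $H(\Phi) = \ln 2\pi$, in line with the completely random initial phase considered earlier. The function names are illustrative.

```python
import math

C_QPE = 1.18  # constant quoted in the text for Eq. (23)


def achieved_error(K):
    """Asymptotic error of the M = 1 scheme, 1.18 * exp(-K ln(2) / 2)."""
    return C_QPE * math.exp(-K * math.log(2) / 2)


def optical_lower_bound(m, mean_n, prior_entropy=math.log(2 * math.pi)):
    """Right-hand side of Eq. (26); prior_entropy is H(Phi) in nats.

    The default ln(2*pi) is an assumed uniform prior on [0, 2*pi).
    """
    return (2 * math.pi * math.e) ** -0.5 * math.exp(prior_entropy - m - mean_n)


if __name__ == "__main__":
    for K in (1, 2, 5, 10):
        print(K, achieved_error(K), optical_lower_bound(2 * K, K))
```

Under these assumptions the achieved error decays as $e^{-0.35K}$ while the bound decays as $e^{-3K}$, so the scheme respects the bound with an exponent smaller by a factor of about 0.12, which is the different coefficient referred to above.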



VII. DISCUSSION 

The average estimation error and mutual information
have been shown to satisfy the general entropic bounds
in Eqs. (6) and (12), for any shift generator having a
discrete spectrum, and for any prior distribution of the
shift parameter. While, for phase shift generators, the
G-asymmetry can be bounded above in terms of the variance
of the generator, via Eq. (14), the G-asymmetry is
typically much less than this upper bound. Hence, the
average estimation error can scale very differently to the
local precision in Eq. (13), in terms of available resources
such as number of qubits or total input photon number.

Indeed, somewhat surprisingly, a simple replacement of 
a linear generator by some nonlinear function thereof may 
have no effect on the average estimation error, yet lead 
to a marked improvement of the local precision. Further, 
while such scaling differences can disappear for iterative 
estimation schemes, this is not always the case. 

It follows that the optimal scaling of the local pre- 
cision, for a given value of the shift parameter, should 
be treated with some caution. As noted in relation to 
Eq. @ , the local precision is a direct measure of the aver- 
age root mean square error only over an interval for which 
the corresponding estimate is (approximately) unbiased. 
However, many estimators in the literature are unbiased 
only over a very small interval, similar in magnitude to 
the local precision itself. In such a case (which is relevant, 
for example, in phase tracking and phase sensing
applications [H), for the optimal scaling to be achievable,
the amount of prior information required about the shift 
parameter is typically so great that the estimate itself 
can only extract up to 1 bit of further information, irre- 
spective of the number of resources n. Examples of this 
phenomenon have been given in Secs. III C and IV.

It is concluded from the above that meaningful comparisons
between various estimation schemes are most
easily made on the basis of the operationally significant
quantity, $\epsilon(\hat\Phi)$ in Eq. (1), rather than the local
precision $P_\phi(\hat\Phi)$ in Eq. (3).

Alternatively, if $P_{\phi'}(\hat\Phi)$ for some $\phi'$ is used, then it should
be supplemented by the interval over which this estimate
is (approximately) locally unbiased, i.e., the interval
for which the local precision of the estimate corresponds
to the actual root-mean-square error of the estimate,
$\langle(\hat\Phi - \phi)^2\rangle^{1/2}$. Note that the width of this interval
also bounds the width of any 'sweet spot' for which
$P_\phi(\hat\Phi) = P_{\phi'}(\hat\Phi)$, and hence bounds the width of the prior
phase probability density required to ensure that a precision
of $P_{\phi'}(\hat\Phi)$ is actually achieved via measurement. As per
Eq. Q , the estimation error averaged over this prior
distribution will then (approximately) be equal to $P_{\phi'}(\hat\Phi)$.

Universal lower bounds for the average estimation error
have also been given, in terms of the number of qubits
or photons available, in Eqs. (20) and (26). Like Eq. (12),
these bounds are independent of the form of the generator,
and hence apply equally well to both linear and
nonlinear schemes, including multipass schemes. They
imply that it is impossible to achieve a scaling better
than exponential, in terms of qubit number, and in terms
of input photon number plus number of modes, respectively.
Further, exponential scalings can be attained by
multipass linear estimation schemes, as has indeed been
shown experimentally for optical phase shifts [14]. It follows
that, in terms of the best possible scaling that can
be achieved relative to qubit or input photon number,
nonlinear schemes offer no fundamental advantage over
linear schemes if multiple passes are possible.



However, a practical advantage of nonlinear schemes
may be a greater robustness to loss. For example, multipass
linear schemes of the type discussed above will be
highly sensitive to loss due to multiple (or longer)
interactions with the phase shift medium. Thus it would be
interesting to find alternative physical implementations
of the generator in Eq. (21) and its optical analogue. Further,
while only shot-noise scaling is achievable for simple
linear schemes in the presence of loss 0,51,1111 (including
for the average estimation error [Hj]), there is evidence
this may not be the case for nonlinear schemes [6]. Hence
further investigation is required, including the determination
of fundamental scaling bounds for lossy schemes
analogous to Eqs. (20) and (26).

It would also be of interest to investigate the degree to
which results generalise to the case of a generator with
a continuous spectrum, such as spatial translations generated
by a momentum operator. For example, the fundamental
bounds (20) and (26) are universal, since $n \ln 2$
and $(m + \langle N\rangle) \ln e$ are respective upper bounds for mutual
information, following via the Holevo bound. The possibility
of other generalisations is supported by results such
as a universal Heisenberg-type scaling for the average
estimation error, in terms of $\langle |G| \rangle$, which holds for both
discrete and continuous generators [13]. Further, weaker
measurement-dependent entropic bounds on the average
estimation error, given in [27], may prove helpful. Note
that some differences are to be expected regarding linearity
vs nonlinearity for the continuous case, since, for
example, the property $H(f(G)|\rho) \leq H(G|\rho)$ no longer
holds in general.